
[ENH] Add config validation with descriptive error messages#529

Open
Saurabh6266 wants to merge 1 commit into mllam:main from Saurabh6266:enh/config-validation-error-messages

Conversation


@Saurabh6266 Saurabh6266 commented Mar 27, 2026

Summary

Closes #528

Adds a validate_config() function to neural_lam/config.py that checks for common config problems before training or graph creation begins, and raises InvalidConfigError with clear, actionable messages instead of raw tracebacks.

Motivation

Two common config errors currently produce hard-to-debug tracebacks:

  1. datastore.config_path pointing to a non-existent file causes a deep FileNotFoundError inside init_datastore, with no indication of which config field is wrong.
  2. ManualStateFeatureWeighting with an empty weights dict produces a silent no-op during training rather than an early error.

Both are now caught at startup with a message pointing directly to the relevant field and showing an example fix.

Changes

  • neural_lam/config.py: Added validate_config(config, config_path) called inside load_config_and_datastore after YAML parsing
  • tests/test_config.py: Added five tests for validate_config covering existing path (passes), missing path (raises with correct field name), error message content, empty ManualStateFeatureWeighting (raises), and non-empty ManualStateFeatureWeighting (passes). Existing tests untouched.
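
The validation described above might look roughly like this. This is a hedged sketch, not the actual patch: the real neural_lam config is a structured object, so a plain dict with an assumed field layout stands in here, and the exact message wording is illustrative.

```python
# Hedged sketch of validate_config(); the real neural_lam config object
# differs, so a plain dict with an assumed layout stands in here.
from pathlib import Path


class InvalidConfigError(Exception):
    """Raised for config problems that would otherwise fail deep in training."""


def validate_config(config: dict, config_path: str) -> None:
    """Check for common config problems before training or graph creation."""
    # Check 1: datastore.config_path must point to an existing file,
    # instead of letting init_datastore hit a bare FileNotFoundError.
    datastore_path = Path(config["datastore"]["config_path"])
    if not datastore_path.is_file():
        raise InvalidConfigError(
            f"datastore.config_path: file not found: '{datastore_path}' "
            f"(referenced from {config_path}). Example fix:\n"
            "  datastore:\n"
            "    config_path: path/to/existing/datastore.yaml"
        )

    # Check 2: ManualStateFeatureWeighting must not have empty weights,
    # which would otherwise be a silent no-op during training.
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:
        raise InvalidConfigError(
            "training.state_feature_weighting: manual feature weighting "
            "requires a non-empty 'weights' mapping. Example fix:\n"
            "  state_feature_weighting:\n"
            "    weights:\n"
            "      t2m: 1.0"
        )
```

Calling this inside load_config_and_datastore right after YAML parsing means both failure modes surface at startup with the offending field named, rather than mid-run.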

Why this matters for future development

The probabilistic forecasting project and other extensions will introduce new required config fields. Having a single validation function makes it straightforward to add new checks without risk of new fields producing cryptic runtime errors deep in the training loop.

Testing

pytest tests/test_config.py -v
pre-commit run --all-files
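
The five tests described under Changes could be sketched as follows. This is illustrative only: a minimal stand-in validator is defined inline so the snippet is self-contained, whereas the real test file would import validate_config and InvalidConfigError from neural_lam.config.

```python
# Illustrative pytest sketch of the five validation tests; the inline
# stand-in validator replaces the real imports from neural_lam.config.
from pathlib import Path

import pytest


class InvalidConfigError(Exception):
    pass


def validate_config(config, config_path):
    path = Path(config["datastore"]["config_path"])
    if not path.is_file():
        raise InvalidConfigError(f"datastore.config_path: file not found: '{path}'")
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:  # manual weighting configured but empty
        raise InvalidConfigError(
            "training.state_feature_weighting: 'weights' must be non-empty"
        )


def test_validate_passes_for_existing_path(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    validate_config({"datastore": {"config_path": str(ds)}}, "config.yaml")


def test_validate_raises_for_missing_path():
    with pytest.raises(InvalidConfigError):
        validate_config({"datastore": {"config_path": "no/such.yaml"}}, "config.yaml")


def test_error_message_names_the_field():
    with pytest.raises(InvalidConfigError, match="datastore.config_path"):
        validate_config({"datastore": {"config_path": "no/such.yaml"}}, "config.yaml")


def test_empty_manual_weighting_raises(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    cfg = {"datastore": {"config_path": str(ds)},
           "training": {"state_feature_weighting": {}}}
    with pytest.raises(InvalidConfigError, match="state_feature_weighting"):
        validate_config(cfg, "config.yaml")


def test_nonempty_manual_weighting_passes(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    cfg = {"datastore": {"config_path": str(ds)},
           "training": {"state_feature_weighting": {"t2m": 1.0}}}
    validate_config(cfg, "config.yaml")
```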

Checklist

  • Tests pass locally
  • pre-commit passes
  • Existing tests unaffected

- Added validate_config() to neural_lam/config.py
- Checks datastore.config_path exists on disk before training starts
- Checks ManualStateFeatureWeighting.weights is non-empty
- Raises InvalidConfigError with actionable message pointing to the field
- Added 5 tests to tests/test_config.py covering all validation paths

Closes mllam#528
@princekumarlahon

This is a really nice improvement: catching config issues early with clear error messages will make things much easier to debug, especially for new users. I also like that validation happens right after loading the config and that you added targeted tests for the new cases. I had a few thoughts, mainly around how this might scale as more config options get added:

  1. Right now all the checks seem to live inside validate_config. That works well for now, but I wonder if it might get a bit crowded over time. Maybe grouping validations by sections (like datastore, training, etc.) could help keep things organized as it grows.

  2. It might also be worth standardizing the error messages a bit more, for example always including the full config path (training.state_feature_weighting) and keeping a consistent format for suggested fixes.

  3. Since this is meant to support future additions (like probabilistic forecasting configs), it could be useful to have a clear pattern for adding new validation rules, so contributors don’t have to modify one large function every time.

Out of curiosity, do you plan to expand validation to more fields over time, or keep it focused on the most common failure cases? Overall though, this is a great step toward making config errors much more user-friendly.

@Saurabh6266
Author

Thanks @princekumarlahon, really appreciate the detailed feedback!

These are all good points. To respond to each:

  1. Grouping by section — agreed that a flat validate_config will get crowded as more fields are added. One approach would be breaking it into _validate_datastore and _validate_training helpers called from the main function, so each section stays self-contained. Happy to restructure it that way in a follow-up commit if that direction makes sense to the maintainers.

  2. Consistent error message format — makes sense. Right now the path format is consistent (datastore.config_path, training.state_feature_weighting) but the suggested fix wording varies slightly between checks. I can tighten that into a fixed template — something like [field]: <what's wrong>. Example fix: <yaml snippet> — so all messages follow the same structure.

  3. Extensibility pattern — the intent was exactly this: new fields get registered in one place with their error description, rather than buried in ad-hoc conditionals. If the current structure isn't clear enough on that, I can add a short comment above the validation calls explaining the pattern.
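
To make points 1 and 2 concrete, the restructuring could look roughly like this. Helper names, the message template, and the config shape are all hypothetical; this is a sketch of the pattern, not the proposed patch.

```python
# Hypothetical restructuring into per-section helpers with a shared
# message template; names and config layout are illustrative only.
from pathlib import Path


class InvalidConfigError(Exception):
    pass


def _config_error(field: str, problem: str, example_fix: str) -> InvalidConfigError:
    """Build every error in one format: [field]: <problem>. Example fix: <yaml>."""
    return InvalidConfigError(f"[{field}]: {problem}. Example fix:\n{example_fix}")


def _validate_datastore(config: dict) -> None:
    path = Path(config["datastore"]["config_path"])
    if not path.is_file():
        raise _config_error(
            "datastore.config_path",
            f"file not found: '{path}'",
            "  datastore:\n    config_path: path/to/datastore.yaml",
        )


def _validate_training(config: dict) -> None:
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:  # manual weighting configured but empty
        raise _config_error(
            "training.state_feature_weighting",
            "manual feature weighting requires a non-empty 'weights' mapping",
            "  state_feature_weighting:\n    weights:\n      t2m: 1.0",
        )


def validate_config(config: dict, config_path: str) -> None:
    # New config sections register a helper here instead of growing
    # one flat function, so each section stays self-contained.
    for section_validator in (_validate_datastore, _validate_training):
        section_validator(config)
```

With this shape, a future section (say, one for probabilistic forecasting options) would add one helper and one entry in the tuple, and its error messages would inherit the template for free.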

On your question about scope: the plan is to keep the initial PR focused on the two most failure-prone cases (missing datastore file, empty manual weights) and expand incrementally as new config fields are introduced — especially the ones needed for the probabilistic forecasting extension. That way each addition stays reviewable and testable on its own.

Let me know if you'd like me to address any of these before review, or if it's better to handle them as follow-up PRs.

@princekumarlahon

This sounds great; I like the direction. Splitting validation into section-based helpers and standardizing error messages both make sense. And +1 to keeping this PR focused and handling structural improvements as follow-ups. I think this is in a good place.



Development

Successfully merging this pull request may close these issues.

[ENH] Improve config validation error messages for missing or invalid fields
